METHOD AND APPARATUS FOR OMNIDIRECTIONAL IMAGING



[0001] This application claims the benefit of U.S. Provisional Appln. No. 60/193,246, 
filed March 30, 2000, which is a continuation-in-part of co-pending U.S. Appln. No. 
09/098,322, filed June 16, 1998, the disclosures of which are incorporated by 
reference herein in their entirety. 



[0002] The invention is related to omnidirectional imaging and transmission, and 
more particularly to an omnidirectional imaging method and apparatus that obtains 
images over an entire hemispherical field of view simultaneously and a corresponding 
image viewing scheme. 



[0003] A number of approaches have been proposed for imaging systems that attempt
to achieve a wide field-of-view (FOV). Most existing imaging systems employ 
electronic sensor chips, or still photographic film, in a camera to record optical 
images collected by the camera's optical lens system. The image projection for most 
camera lenses is modeled as a "pin-hole" with a single center of projection. Because 
the sizes of the camera lens and the imaging sensor have their practical limitations, 
the light rays that can be collected by a camera lens and received by the imaging 
device typically form a cone with a very small opening angle. Therefore, the angular
fields of view of conventional cameras are limited, typically ranging from 5 to
50 degrees. This limited range makes conventional cameras unsuitable for achieving
a wide FOV, as can be seen in Figure 1.

[0004] Wide-viewing-angle lens systems, such as fish-eye lenses, are designed to 
have a very short focal length which, when used in place of a conventional camera lens,
enables the camera to view objects at a much wider angle to obtain a panoramic view,
as shown in Figure 1. In general, to widen the FOV, the design of the fish-eye lens is
made more complicated. As a result, obtaining a hemispherical FOV would require 
the fish-eye lens to have overly large dimensions and a complex, expensive optical 
design. Further, it is very difficult to design a fish-eye lens that conforms to a single
viewpoint constraint, where all incoming principal light rays intersect at a single point 
to form a fixed viewpoint, to minimize or eliminate distortion. The fish-eye lens 
allows a statically positioned camera to acquire a wider angle of view than a 
conventional camera, as shown in Figure 1 . However, the nonlinear properties 
resulting from the semi-spherical optical lens mapping make the resolution along the 
circular boundary of the image very poor. This is problematic if the field of view 
corresponding to the circular boundary of the image represents an area, such as a 
ground or floor, where high image resolution is desired. Although the images 
acquired by fish-eye lenses may be adequate for certain low-precision visualization 
applications, these lenses still do not provide adequate distortion compensation. The
high cost of the lenses, together with the distortion problem, has prevented fish-eye
lenses from being widely adopted.

[0005] To remedy the problems presented by the fish-eye lenses, large FOVs may be 
obtained by using multiple cameras in the same system, each camera pointing in a 
different direction. However, seamless integration of the multiple images is
complicated by the fact that the image produced by each camera has a different
center of projection. Another possible solution for increasing the FOV of an imaging
system is to rotate the entire imaging system about its center of projection, thereby 
obtaining a sequence of images that are acquired at different camera positions to be 
joined together to obtain a panoramic view of the scene. Rotating imaging systems, 
however, require the use of moving parts and precision positioning devices, making 
them cumbersome and expensive. A more serious drawback is that rotating imaging 
systems cannot obtain multiple images with wide FOV simultaneously. In both the 
multiple camera and rotating camera systems, obtaining complete wide FOV images 
can require an extended period of time, making these systems inappropriate for 
applications requiring real-time imaging of moving objects. Further, none of the 
above-described systems can generate three-dimensional (3D) omnidirectional 
images. 

[0006] There is a long-felt need for an omnidirectional imaging system that can
obtain 3D omnidirectional images in real-time without encountering the 
disadvantages of the systems described above. 
Summary of the Invention
[0007] Accordingly, the present invention is directed to an efficient omnidirectional 
image processing method and system that can obtain, in real-time, non-distorted 
perspective and panoramic images and videos based on the real-time omnidirectional 
images acquired by omnidirectional image sensors. Instead of solving complex
high-order nonlinear equations via computation, the invention uses a mapping matrix to
define a relationship between pixels in a user-defined perspective or panoramic 
viewing window and pixel locations on the original omnidirectional image source so 
that the computation of the non-distorted images can be performed in real-time at a 
video rate (e.g., 30 frames per second). This mapping matrix scheme facilitates the 
hardware implementation of the omnidirectional imaging algorithms. 
[0008] In one embodiment, the invention also includes a change/motion detection 
method using omnidirectional sequential images directly from the omnidirectional 
image source. Once a change is detected on an omnidirectional image, direction and 
configuration parameters (e.g., zoom, pan, and tilt) of a perspective viewing window 
can be automatically determined. The omnidirectional imaging method and apparatus 
of the invention can therefore offer unique solutions to many practical systems that 
require a simultaneous 360 degree viewing angle and three dimensional measurement 
capability. 

Brief Description of the Drawings 
[0009] Figure 1 is a diagram comparing the fields of view between a conventional 
camera, a panoramic camera, and an omnidirectional camera; 

[0010] Figures 2a, 2b and 2c are examples of various reflective convex mirrors used 
in omnidirectional imaging; 

[0011] Figure 3 illustrates one manner in which an omnidirectional image is obtained 
from a convex mirror having a single virtual viewpoint; 

[0012] Figure 4 illustrates the manner in which one embodiment of the invention 
creates a mapping matrix; 

[0013] Figures 5a and 5b illustrate configuration parameters of a perspective viewing 
window; 
[0014] Figure 6 is a block diagram illustrating a process for establishing a mapping
matrix according to one embodiment of the present invention; 
[0015] Figure 7 is a representative diagram illustrating the relationship between a 
user-defined panoramic viewing window and corresponding pixel values; 
[0016] Figure 8 is a block diagram illustrating a change/motion detection scheme using
omnidirectional images; 

[0017] Figure 9 is a diagram illustrating one way in which a direction of a desired 
area is calculated and automatically focused based on the process shown in Figure 8; 
[0018] Figure 10 is a perspective view of a voice-directed omnidirectional camera to 
be used in the present invention; 

[0019] Figure 11 is a block diagram illustrating a voice-directed perspective viewing
process; 

[0020] Figure 12 is a block diagram of the inventive system incorporating an Internet
transmission scheme; 

[0021] Figure 13 is a representative diagram of Internet communication server 
architecture according to the inventive system; 

[0022] Figure 14 is a representative diagram of a server topology used in the present 
invention; 

[0023] Figures 15a and 15b are flowcharts of server programs used in the present 
invention; and 

[0024] Figure 16 is a representative diagram of the invention used in a two-way 
communication system. 

Description of the Preferred Embodiments 
[0025] To dramatically increase the field of view of an imaging system, the present 
invention employs a reflective surface (i.e., convex mirror) to obtain an 
omnidirectional image. In particular, the field of view of a video camera can be 
greatly increased by using a reflective surface with a properly designed surface shape 
that provides a greater field of view than a flat reflective surface. There are a number 
of surface profiles that can be used to produce an omnidirectional FOV. Figures 2a, 
2b, and 2c illustrate several examples of convex reflective surfaces that provide 
increased FOV, such as a conic mirror, spherical mirror, and parabolic mirror, 
respectively. The optical geometry of these convex mirrors provides a simple and
effective means to convert a video camera's planar view into an omnidirectional view 
around the vertical axis of these mirrors without using any moving parts. 
[0026] Figures 2a through 2c appear to indicate that any convex mirror can be used 
for omnidirectional imaging; however, a satisfactory imaging system according to the 
invention must meet two requirements. First, the system must create a one-to-one 
geometric correspondence between pixels in an image and points in the scene. 
Second, the convex mirror should conform to a "single viewpoint constraint"; that is, 
each pixel in the image corresponds to a particular viewing direction defined by a ray 
from that pixel on an image plane through a single viewing point such that all of the 
light rays are directed to a single virtual viewing point. Based on these two 
requirements, the convex mirrors shown in Figures 2a through 2c can increase the 
field of view but are not satisfactory imaging devices because the reflecting surfaces 
of the mirrors do not meet the single viewpoint constraint, which is desirable for a 
high-quality omnidirectional imaging system. 

[0027] The preferred design for a reflective surface used in the inventive system will 
now be described with reference to Figure 3. As noted above, the preferred reflective 
surface will cause all light rays reflected by the mirror to pass through a single virtual 
viewpoint, thereby meeting the single viewpoint constraint. By way of illustration, 
Figure 3 shows a video camera 30 having an image plane 31 on which images are 
captured and a regular lens 32 whose field of view preferably covers the entire 
reflecting surface of the mirror 34. Since the optical design of camera 30 and lens 32 
is rotationally symmetric, only the cross-sectional function z(r) defining the mirror 
surface cross-section profile needs to be determined. The actual mirror shape is 
generated by the revolution of the desired cross-section profile about its optical axis. 
The function of the mirror 34 is to reflect all viewing rays coming from the focal
point C of the video camera 30 to the surface of physical objects in the field of view. The
key feature of this reflection is that all such reflected rays must have a projection 
towards a single virtual viewing point at the mirror's focal center, labeled as O. In 
other words, the mirror should effectively steer viewing rays such that the camera 30 
equivalently sees the objects in the world from a single viewpoint O. 
[0028] A hyperbola is the preferred cross-sectional shape of the mirror 34 because a
hyperbolic mirror will satisfy the geometric correspondence and single viewpoint 
constraint requirements of the system. More particularly, the extension of any ray 
reflected by the hyperbolic curve and originating from one of the curve's focal points 
passes through the curve's other focal point. If the mirror 34 has a hyperbolic profile, 
and a video camera 30 is placed at one of the hyperbolic curve's focal points C, as 
shown in Figure 3, the imaging system will have a single viewpoint at the curve's 
other focal point O. As a result, the system will act as if the video camera 30 were 
placed at the virtual viewing location O. 
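The focal property described above can be checked numerically. The following is a minimal sketch, assuming a hyperbola written in the standard centered form z^2/b^2 - r^2/a^2 = 1 with foci at z = -c and z = +c (the coordinate origin, the parameter names `a` and `b`, and the function names are illustrative assumptions, not taken from the specification). It reflects a ray travelling from the camera focus C off the mirror and measures how far the reflected ray deviates from the line through the other focus O.

```python
import math

def reflect_from_focus(a, b, r):
    """Reflect a ray from focus C off the hyperbola branch z = b*sqrt(1 + r^2/a^2).

    Returns the reflected direction and the vector from focus O to the hit point M.
    The centered hyperbola form and its parameters a, b are assumptions.
    """
    c = math.sqrt(a * a + b * b)          # distance of each focus from the center
    cam = (0.0, -c)                       # camera focal point C
    view = (0.0, c)                       # virtual viewpoint O
    m = (r, b * math.sqrt(1.0 + (r * r) / (a * a)))   # reflection point M

    # Unit direction of the ray travelling from C to M.
    dx, dz = m[0] - cam[0], m[1] - cam[1]
    dn = math.hypot(dx, dz)
    dx, dz = dx / dn, dz / dn

    # Unit surface normal: gradient of z^2/b^2 - r^2/a^2 - 1.
    nx, nz = -2.0 * m[0] / (a * a), 2.0 * m[1] / (b * b)
    nn = math.hypot(nx, nz)
    nx, nz = nx / nn, nz / nn

    # Mirror reflection: d' = d - 2 (d . n) n.
    dot = dx * nx + dz * nz
    refl = (dx - 2.0 * dot * nx, dz - 2.0 * dot * nz)
    return refl, (m[0] - view[0], m[1] - view[1])

def misalignment(a, b, r):
    """2D cross product; zero when the reflected ray lies on the line through O and M."""
    refl, om = reflect_from_focus(a, b, r)
    return refl[0] * om[1] - refl[1] * om[0]
```

A value of `misalignment` near zero for every radius `r` indicates that the extension of each reflected ray passes through O, which is the single viewpoint behavior the paragraph describes.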

[0029] The mathematical equation describing the hyperbolic mirror surface profile is:

$$(z + c)^2 = \ldots$$

As a result, the unique reflecting surface of the mirror 34 causes the extension of the
incoming light ray sensed by the camera 30 to always pass through a single virtual
viewpoint O, regardless of the location of the projection point M on the mirror
surface.

[0030] The image obtained by the camera 30 and captured on the camera's image
plane 31 will exhibit some distortion due to the non-planar reflecting surface of the
mirror 34. To facilitate the real-time processing of the omnidirectional image, the
inventive system uses an algorithm to map the pixels from the distorted
omnidirectional image on the camera's image plane 31 onto a perspective window
image 40 directly, once the configuration of the perspective or panoramic window is
defined. As shown in Figure 4, a virtual perspective viewing window 40 can be
arbitrarily defined in a three-dimensional space using three parameters: Zoom, Pan
and Tilt $(d, \alpha, \beta)$. Figures 5a and 5b illustrate the definition of these three parameters.
More particularly, Zoom is defined as the distance $d$ of the perspective window plane W
40 from the focal point of the mirror 34, Pan is defined as the angle $\alpha$
between the x-axis and the projection of the perspective window's W 40 normal
vector onto the x-y plane, and Tilt is defined as the angle $\beta$ between the x-y plane and
the perspective window's W 40 normal vector. All of these parameters can be
adjusted by the user.

[0031] In addition to the Zoom, Pan and Tilt parameters $(d, \alpha, \beta)$, the user can also
adjust the dimensions of the pixel array (i.e., number of pixels) to be displayed in the
perspective viewing window. Once the perspective viewing window W 40 is defined,
the system can establish a mapping matrix that relates the pixels in the distorted
omnidirectional image I(i,j) to pixels W(p,q) in the user-defined perspective viewing
window W 40 to form a non-distorted perspective image. The conversion from the
distorted omnidirectional image into a non-distorted perspective image using a
one-to-one pixel correspondence between the two images is unique.
[0032] Figure 6 is a block diagram illustrating one method 60 for establishing a
mapping matrix to convert the distorted omnidirectional image into the non-distorted
perspective image in the perspective viewing window W. As noted above, the user
first defines a perspective viewing window in three-dimensional space by specifying
the Zoom, Pan and Tilt parameters at step 62. This flexibility accommodates a wide
range of viewing needs.

[0033] Once these parameters are defined, a mapping matrix can be generated based
on the fixed geometric relationship of the imaging system. More particularly, a "ray
tracing" algorithm is applied for each pixel W(p,q) in the perspective viewing window
to determine the corresponding unique reflection point M on the surface of the mirror
at step 64, thereby obtaining a projection of each pixel in W onto the surface of the
omni-mirror. In the ray tracing algorithm, a straight line is traced from the pixel
location on W, denoted as W(p,q), to the focal center O of the omni-mirror, and the
point at which this line meets the mirror surface is recorded as M(p,q), as
illustrated in Figure 5.

[0034] Once each perspective viewing window pixel is linked to a reflection point
M(p,q), the system projects each reflection point M(p,q) back to the focal point of the
imaging sensor and then determines the corresponding pixel location I(i,j) on the
sensor's image plane based on the geometric relationship between the camera and
mirror at step 66. More particularly, the projection line from M(p,q) to C is
intercepted by the image plane I at a pixel location (i,j). A one-to-one
mapping relationship can therefore be established between W(p,q) and I(i,j) such that
for each pixel in the perspective viewing window W, there is a unique pixel location
in the omnidirectional image that corresponds to W(p,q), allowing the pixel values
(e.g., RGB values) in the omnidirectional image to be used in the counterpart pixels in
the perspective window.

[0035] At step 68, a mapping matrix MAP is established to link each pixel in the
perspective viewing window W with the corresponding pixel values in the
omnidirectional image such that W(p,q) = MAP [I(i,j)]. The dimension of the
mapping matrix MAP is the same as that of the pixel arrangement in the perspective
viewing window W 40, and each cell of the mapping matrix stores two index values
(i,j) of the corresponding pixel in the omnidirectional image I at step 72.
[0036] Once the mapping matrix MAP has been established, the real-time
image-processing task is greatly simplified and can be conducted in a single step at step 70
by applying the mapping matrix MAP to each pixel I(i,j) in the omnidirectional image
I to determine the pixel values for each corresponding pixel in the perspective
viewing window W. Further, each time a new omnidirectional image I is acquired, a
look-up table operation can be performed to generate the non-distorted perspective
image for display in the perspective viewing window W at step 72.
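The two-phase scheme described above (build MAP once, then apply it to every frame) can be sketched as follows. This is only an illustration under stated assumptions: `trace_pixel` is a hypothetical stand-in for the ray tracing of steps 64 and 66, MAP is held as two integer arrays rather than one array of index pairs, and NumPy integer-array indexing plays the role of the look-up table operation.

```python
import numpy as np

def build_map(height, width, trace_pixel):
    """Precompute MAP: one (i, j) source index pair per viewing-window pixel (p, q).

    trace_pixel(p, q) -> (i, j) is a hypothetical callback standing in for the
    ray tracing through the mirror and back to the image plane.
    """
    map_i = np.zeros((height, width), dtype=np.intp)
    map_j = np.zeros((height, width), dtype=np.intp)
    for p in range(height):
        for q in range(width):
            map_i[p, q], map_j[p, q] = trace_pixel(p, q)
    return map_i, map_j

def apply_map(omni, map_i, map_j):
    """Per-frame step: a pure table look-up, W(p, q) = I(i, j) with (i, j) = MAP(p, q)."""
    return omni[map_i, map_j]
```

Because `apply_map` contains no geometry, its per-frame cost depends only on the size of the viewing window, which is the property that makes video-rate operation plausible. The same call works for color frames of shape (rows, columns, 3), since the indexing leaves trailing axes untouched.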
[0037] Referring now to Figure 7, the perspective viewing window in the inventive
system can be a panoramic viewing window 74 with few modifications to the system.
The image processing procedure using a panoramic viewing window 74 is very
similar to the process described above with respect to the perspective viewing window
40. As shown in Figure 7, a virtual panoramic viewing window 74 can be arbitrarily
defined in three-dimensional space by a user using two parameters: Zoom and Tilt
$(d, \beta)$, subject to the only constraint that the normal of the window plane should point
directly toward the focal center of the reflective mirror. In
addition to the Zoom and Tilt parameters $(d, \beta)$, the user can also adjust the
dimensions of the pixel array (e.g., the number of pixels) to be displayed in the
panoramic viewing window 74. Once these parameters are defined, a mapping matrix
can be generated based on the fixed geometric relationship of the imaging system in
the same manner explained above with respect to Figure 6 to generate a non-distorted
image in the panoramic viewing window 74.
[0038] Note that due to the non-linear geometric relationship between the perspective
viewing window image W(p,q) and the omnidirectional image I(i,j), the intercepting
point of the back-projection of the reflection point M(p,q) may not correspond
directly with any pixel position on the image plane. In such cases, the inventive
system may use one of several alternative methods to obtain the pixel values for the
perspective viewing window image W(p,q). One option is to use the pixel values of
the closest neighboring point in the omnidirectional image I without any
interpolation by, for example, quantizing the calculated coordinate values into
integers and using the pixel values at those integer coordinates for the perspective
viewing window pixel W(p,q). Although this method is the fastest way to obtain the pixel
values, it does possess inherent quantization errors.

[0039] A less error-prone method is to use linear interpolation to resolve the pixel
values of calculated fractional coordinate values. For example, if the calculated
coordinate value $(i_0, j_0)$ falls within the grid cell formed by (i, j), (i, j+1), (i+1, j), and
(i+1, j+1), the corresponding W(p,q) value can be obtained from the following linear
interpolation formula:

$$W(p,q) = (j + 1 - j_0)\,\bigl[(i + 1 - i_0)\,I(i,j) + (i_0 - i)\,I(i+1,j)\bigr] + (j_0 - j)\,\bigl[(i + 1 - i_0)\,I(i,j+1) + (i_0 - i)\,I(i+1,j+1)\bigr]$$

Yet another alternative is to use other types of interpolation schemes to enhance the
fidelity of the converted images, such as average, quadratic interpolation, B-spline,
etc.

80169-0031 (GNX-31) Patent Application

[0040] The actual image-processing algorithms implemented for linking pixels from
the omnidirectional image I(i,j) to the perspective viewing window W(p,q) have been
simplified by the inventive system from complicated high-order non-linear equations
to a simple table look-up function, as explained above with respect to Figure 6. Note
that before the actual table look-up function is conducted, the parameter space needs
to be partitioned into a finite number of configurations. In the case of a perspective
viewing window, the parameter space is three-dimensional, defined by the Zoom, Pan
and Tilt parameters $(d, \alpha, \beta)$, while in the case of a panoramic viewing window, the
parameter space is two-dimensional, defined only by the Zoom and Tilt parameters $(d, \beta)$.
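The nearest-neighbor and linear interpolation options discussed above can be compared with a short sketch. It assumes standard bilinear weighting, in which each corner of the grid cell is weighted by its closeness to the fractional coordinate, rounding (rather than truncation) for the nearest-neighbor case, and clamping at the image border; the function names are illustrative and the image must be at least 2 x 2 pixels.

```python
import numpy as np

def sample_nearest(omni, i0, j0):
    """Nearest-neighbour look-up: round the fractional coordinates to a pixel."""
    i = min(max(int(round(i0)), 0), omni.shape[0] - 1)
    j = min(max(int(round(j0)), 0), omni.shape[1] - 1)
    return omni[i, j]

def sample_bilinear(omni, i0, j0):
    """Linear interpolation over the grid cell (i, j) .. (i+1, j+1) containing (i0, j0)."""
    i = min(max(int(np.floor(i0)), 0), omni.shape[0] - 2)
    j = min(max(int(np.floor(j0)), 0), omni.shape[1] - 2)
    di, dj = i0 - i, j0 - j            # fractional offsets inside the cell
    at_j = (1 - di) * omni[i, j] + di * omni[i + 1, j]            # blend along i at column j
    at_j1 = (1 - di) * omni[i, j + 1] + di * omni[i + 1, j + 1]   # blend along i at column j+1
    return (1 - dj) * at_j + dj * at_j1                           # blend along j
```

Since the fractional offsets depend only on the viewing window configuration, they could be stored alongside the integer indices in the mapping matrix so that interpolation still reduces to a table look-up plus a few multiplications per pixel.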
[0041] For each possible configuration in the parameter space, a mapping matrix is 
pre-calculated. The mapping matrix MAP having a dimension of (N, M) can be 
stored in the following format: 



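$$\mathrm{MAP} = \begin{bmatrix}
(i_{1,1}, j_{1,1}) & (i_{1,2}, j_{1,2}) & (i_{1,3}, j_{1,3}) & \cdots & (i_{1,M}, j_{1,M}) \\
(i_{2,1}, j_{2,1}) & (i_{2,2}, j_{2,2}) & (i_{2,3}, j_{2,3}) & \cdots & (i_{2,M}, j_{2,M}) \\
(i_{3,1}, j_{3,1}) & (i_{3,2}, j_{3,2}) & (i_{3,3}, j_{3,3}) & \cdots & (i_{3,M}, j_{3,M}) \\
\vdots & \vdots & \vdots & & \vdots \\
(i_{N,1}, j_{N,1}) & (i_{N,2}, j_{N,2}) & (i_{N,3}, j_{N,3}) & \cdots & (i_{N,M}, j_{N,M})
\end{bmatrix} \qquad (3)$$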



[0042] All possible or desired mapping matrices MAP are pre-stored in a set of
memory chips within the system, such as chips in the "display/memory/local control
logic module" 120 as shown in Figure 12, in a manner that is easily retrievable. Once
a user selects a viewing window configuration, the stored MAP matrix is retrieved
and used to compute the image of the viewing window:

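$$W = \begin{bmatrix}
I(i_{1,1}, j_{1,1}) & I(i_{1,2}, j_{1,2}) & I(i_{1,3}, j_{1,3}) & \cdots & I(i_{1,M}, j_{1,M}) \\
I(i_{2,1}, j_{2,1}) & I(i_{2,2}, j_{2,2}) & I(i_{2,3}, j_{2,3}) & \cdots & I(i_{2,M}, j_{2,M}) \\
I(i_{3,1}, j_{3,1}) & I(i_{3,2}, j_{3,2}) & I(i_{3,3}, j_{3,3}) & \cdots & I(i_{3,M}, j_{3,M}) \\
\vdots & \vdots & \vdots & & \vdots \\
I(i_{N,1}, j_{N,1}) & I(i_{N,2}, j_{N,2}) & I(i_{N,3}, j_{N,3}) & \cdots & I(i_{N,M}, j_{N,M})
\end{bmatrix} \qquad (4)$$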



where I is the omnidirectional image. 
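One way to picture the partitioned parameter space and the pre-stored matrices is a table keyed by quantized configuration. The sketch below is an assumption-laden illustration: the class name `MapStore`, the uniform step sizes, and the use of an in-memory dictionary in place of memory chips are all hypothetical choices, not details given in the specification.

```python
import numpy as np

def quantize(value, step):
    """Snap a continuous parameter to the nearest grid index."""
    return int(round(value / step))

class MapStore:
    """Holds one pre-computed mapping matrix per quantized (zoom, pan, tilt) cell."""

    def __init__(self, zoom_step, pan_step, tilt_step):
        self.steps = (zoom_step, pan_step, tilt_step)
        self.maps = {}

    def key(self, zoom, pan, tilt):
        return tuple(quantize(v, s) for v, s in zip((zoom, pan, tilt), self.steps))

    def store(self, zoom, pan, tilt, map_i, map_j):
        self.maps[self.key(zoom, pan, tilt)] = (map_i, map_j)

    def render(self, omni, zoom, pan, tilt):
        """Retrieve the stored MAP for this configuration and apply it to the frame."""
        map_i, map_j = self.maps[self.key(zoom, pan, tilt)]
        return omni[map_i, map_j]
```

A request whose parameters fall in the same cell as a stored configuration reuses that matrix, so the step sizes trade memory footprint against how finely the user can steer the window.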

[0043] Note that the "display/memory/local control module" 120 shown in Figure 12
is preferably designed to have a built-in memory, image display, user interface, and
self-contained structure such that it can operate without relying upon a separate PC.
[0044] The present invention may also include a change/motion detection scheme 80 
based on frame subtraction, as illustrated in Figure 8. This feature is particularly 
useful in security applications. This particular embodiment conducts frame 
subtraction using sequential omnidirectional images directly instead of using 
converted perspective images. The sequential omnidirectional images in the
description below are denoted as $I_1, I_2, \ldots, I_n$. As can be seen in Figure 8, the motion
detection process first involves acquiring and storing a reference frame of an
omnidirectional image, denoted as $I_0$, at step 81. Next, a sequential omnidirectional
image $I_i$ is acquired at step 82 and a frame subtraction is calculated at step 83 as
follows to obtain a residual image "DIFF":

$$\mathrm{DIFF} = I_0 - I_i \qquad (5)$$

Once the residual image "DIFF" has been calculated, a smoothing filter algorithm is
applied to the residual image to eliminate any spike that may cause a false alarm. If
any element in the residual image "DIFF" is still larger than a pre-set threshold value
after the smoothing step at step 85, the element indicates the presence of, for example,
an intruder or other anomaly. The system converts the area of the image around the
suspected anomalous pixels into a non-distorted perspective image at step 86 for
closer visual examination. More particularly, as shown in Figure 9, the direction of
the suspected anomalous pixel area can be calculated and used as the parameters of
the perspective viewing window W so that the perspective viewing window W is
automatically focused on the suspected anomalous pixel area. An optional alarm can
be activated at step 87 if the image in the perspective viewing window confirms the
presence of suspicious or undesirable activity.

[0045] More particularly, the direction of the suspected anomalous area can be calculated
and fed to the parameters of the perspective viewing window so that the viewing window
is automatically focused on the suspected area. Automatic zoom, pan and tilt
adjustment can be conducted by first determining the center of the suspected area in
the omnidirectional image by calculating the center of gravity of the suspected pixels
as follows:

$$i_0 = \frac{1}{N}\sum_{k=1}^{N} i_k, \qquad j_0 = \frac{1}{N}\sum_{k=1}^{N} j_k \qquad (6)$$

where $N$ is the number of suspected pixels and $(i_k, j_k)$ are their coordinates in the
omnidirectional image.



A pin-hole model of the camera 30 is then used to trace the impinging point on the
mirror of the projection ray that originates from the camera's focal point and passes
through the central pixel $(i_0, j_0)$. The impinging point on the mirror is denoted as $M_0$.
The normal of the perspective viewing window is then determined by using the
projection ray that originates from the camera's focal point and passes through the
impinging point $M_0$. This normal vector effectively defines the pan and tilt
parameters of the perspective viewing window. The zoom parameter can be
determined based on the boundary of the suspected pixel sets using the same ray
tracing method.
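The detection steps described above (frame subtraction, smoothing, thresholding, and locating the center of the suspected area) can be sketched as below. The sketch makes several assumptions that the specification does not fix: the residual is taken as an absolute difference so that brightening and darkening are both detected, the smoothing filter is a 3 x 3 mean filter, and the function names and threshold are illustrative.

```python
import numpy as np

def box_smooth(img, size=3):
    """Mean filter over a size x size neighbourhood (edges padded by replication)."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for di in range(size):
        for dj in range(size):
            out += padded[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out / (size * size)

def detect_change(reference, frame, threshold):
    """Return the centroid (i0, j0) of changed pixels, or None if nothing exceeds the threshold."""
    # Residual image; the absolute value is an assumption of this sketch.
    diff = np.abs(reference.astype(float) - frame.astype(float))
    mask = box_smooth(diff) > threshold      # smoothing suppresses isolated spikes
    if not mask.any():
        return None
    ii, jj = np.nonzero(mask)                # coordinates of the suspected pixels
    return ii.mean(), jj.mean()              # centre of gravity of those pixels
```

The returned centroid corresponds to the central pixel from which the pan and tilt of the viewing window are then derived by ray tracing; because the whole omnidirectional frame is processed at once, a single call covers the full surveillance area.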

[0046] Using the omnidirectional images in change/motion detection is much more 
efficient than other change/motion detection schemes because the omnidirectional 
images contain optically compressed images of the surrounding scene. The entire
area under surveillance can therefore be checked in one operation.
[0047] The system described above can be implemented using an omnidirectional 
image sensor such as the camera 30, with an acoustic sensor such as a selectively 
switchable microphone, directional microphone, or microphone array 104, so that the 
viewing direction of the perspective window can be adjusted to focus on, for example, 
a person speaking. This function is particularly useful in teleconferencing 
applications, where there is a need for detecting and focusing the camera toward the 
active speaker in a meeting. Combining the microphone array 104 with the
omnidirectional image sensor 30 yields a voice-directed viewing window scheme that
allows for automatic adjustment of a perspective viewing window toward the active
speaker in a meeting based on the acoustic signals detected by an array of
spatially-distributed microphones. Sound from a source reaches each microphone in the array
104 with a different intensity and delay, allowing estimation of the spatial direction
of the sound source using the differences in received sound signals among the
microphones 104. The estimated direction of the sound source can then be used to
control the viewing direction of any perspective viewing window.
[0048] Figure 11 is a flowchart illustrating one embodiment of the procedures used to
focus the perspective viewing window on an active speaker using the apparatus shown
in Figure 10. First, the microphone array 104 is used to acquire a sound signal at step
110. As can be seen in Figure 11, multiple microphones can be placed along
the periphery of the image unit to form the array.

[0049] Next, based on the spatial and temporal differences among the sound signals 
received by the microphones in the array, the direction of the sound source is 
estimated at step 111. One possible method for conducting the estimation is as 
follows: if the acoustic signal detected by the k-th microphone unit is denoted as $s_k$,
$k = 1, 2, \ldots, n$, the direction of an active speaker can be determined by the vector summation
of all detected acoustic signals:

$$V = s_1 v_1 + s_2 v_2 + s_3 v_3 + \cdots + s_n v_n \qquad (7)$$
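The vector summation can be illustrated with a small sketch. The specification does not define the vectors in the sum, so the sketch assumes that each one is a horizontal unit vector pointing from the center of the array toward the corresponding microphone, and that each signal value is a non-negative intensity; the function name and the angle convention are likewise illustrative.

```python
import math

def estimate_direction(signals, mic_angles):
    """Vector sum V = sum_k s_k * v_k; returns the pan angle of V in radians.

    signals:    non-negative intensity s_k per microphone (assumed)
    mic_angles: angle of each microphone around the array, in radians, so that
                v_k = (cos(angle_k), sin(angle_k)) (assumed)
    """
    vx = sum(s * math.cos(a) for s, a in zip(signals, mic_angles))
    vy = sum(s * math.sin(a) for s, a in zip(signals, mic_angles))
    return math.atan2(vy, vx)
```

With four microphones spaced 90 degrees apart, a louder signal on one microphone pulls the summed vector toward that microphone, and the resulting angle can be used directly as the pan parameter of the viewing window.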

[0050] Once the estimated direction of the sound source has been determined, the 
system determines the zoom, tilt and pan parameters for configuring the perspective 
viewing window based on the estimated sound source direction at step 112. The 
perspective viewing window position is then adjusted to face the direction of the 
sound source at step 113. 

[0051] The acoustic sensors 162 can be built into the omnidirectional camera or
operated separately. The direction estimation signals are preferably
fed into the host computer so that the omnidirectional camera software can use them
as input in real-time operation.

[0052] Referring now to Figures 12 and 13, the inventive omnidirectional imaging 
system can include an image transmission system that can transmit images and/or data 
over the Internet. Figure 12 is a block diagram of the overall system incorporating an 
Internet transmission scheme, while Figure 13 is a representative diagram of a system 
architecture in an Internet-based omnidirectional image transmission system 
according to the present invention. This embodiment of the present invention uses a 
server 130 to provide the information communication services for the system. The
server 130 simplifies traffic control and reduces the load on the entire network,
making it a more desirable choice than bridge or router devices.
[0053] An Internet-based imaging system is particularly useful in medical
applications because it allows transmission of images or data of a patient to a physician or
other medical practitioner over the Internet. The server provides additional
convenience, eliminating the need for the patient to know where to send his/her data
and to send the data to more than one specialist, by allowing the patient to transfer the
data package only once to the server with an appended address list. The server would
then distribute the data package for the patient, reducing network traffic load and
simplifying data transfer.

[0054] Figure 14 is a representative diagram of the topology of the server 130 in the
Internet-based imaging system of the invention. Clients 132 of the server 130 may
include patients, telemedicine users and practitioners, medical information
visualization systems, databases, and archival and retrieval systems. The basic function
of the server 130 is to manage the communication between its clients 132, e.g.,
receive, transfer, and distribute the medical signals and records, and control the
direction, priority, and stream rate of the information exchange. From a client's point
of view, the client only needs to send and/or receive data to/from the server to
communicate with all designated service providers.

[0055] In accordance with the preferred server architecture of the Internet-based 
imaging system, the communication protocol for the server 130 should include 
connection and data packages. The preferred connection protocol for the server is a 
"socket" protocol, which is an interface-to-internet application layer. As can be seen 
in Figure 14, the network design is a server/client structure having a star
topology.

[0056] The programming task for a client/server communication application should
include two components: a server program (Fig. 15a) and a client program (Fig. 15b).
The tele-monitoring applications require the server program to be able to provide
services for various clients, such as the patients, medical specialists, emergency
services, and storage devices. To effectively use the server services, the client
program should provide a proper interface in order to work with the server. In view
of these requirements, a structure of the program and the interface function of the
client program are disclosed herein.

[0057] Using object-oriented programming, the server program consists of an object 
of listening-socket class and many objects of client-socket class. Figures 15a and 15b 
show one example of possible flowcharts for the server and client programs. Whenever a client 
makes a call to the server, the listening-socket object will accept the call and create a 
client-socket object, which will keep the connection with the client and serve the 
client's request. When a client-socket object receives a package from its client, it will 
interpret the package and reset the communication status or deliver the package to the 
other client according to the request from the client. 
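
A minimal sketch of the listening-socket and client-socket arrangement described above is given below. It is one possible reading and not the disclosed implementation: the class names, the line-based text commands `HELLO` and `SEND`, and the one-thread-per-client design are all assumptions introduced for illustration.

```python
import socket
import threading


class ClientSocket:
    """Keeps one accepted connection open and serves that client's requests."""

    def __init__(self, conn, server):
        self.conn = conn
        self.server = server
        self.name = None

    def send(self, text):
        self.conn.sendall((text + "\n").encode("utf-8"))

    def serve(self):
        reader = self.conn.makefile("r", encoding="utf-8", newline="\n")
        for line in reader:
            command, _, rest = line.rstrip("\n").partition(" ")
            if command == "HELLO":
                # Assumed command: (re)set this client's communication status.
                self.name = rest
                self.server.clients[rest] = self
                self.send("OK")
            elif command == "SEND":
                # Assumed command: deliver the package to another client.
                target, _, payload = rest.partition(" ")
                peer = self.server.clients.get(target)
                if peer is None:
                    self.send("ERR unknown client")
                else:
                    peer.send(f"FROM {self.name} {payload}")
        # The client closed the connection: forget it and release the socket.
        self.server.clients.pop(self.name, None)
        self.conn.close()


class ListeningSocket:
    """Accepts calls and creates one ClientSocket object per caller."""

    def __init__(self, host="127.0.0.1", port=0):
        self.sock = socket.create_server((host, port))
        self.address = self.sock.getsockname()
        self.clients = {}  # client name -> ClientSocket

    def run(self):
        while True:
            try:
                conn, _ = self.sock.accept()
            except OSError:
                break  # listening socket was closed
            client = ClientSocket(conn, self)
            threading.Thread(target=client.serve, daemon=True).start()
```

Because every client talks only to the server, and the server relays packages between client-socket objects, the sketch also reflects the star topology of Figure 14.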

[0058] Besides the object-oriented function, the server also manages the traffic of 
communication among the clients. The server program makes up a table to store 
communication information of all the client socket objects, including connection 
status, client's name, group number, receiving inhibit bits, bridge status, and bridge 
owner. 
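
The table of communication information might be represented as sketched below. The field list follows the paragraph above, but the field types and their semantics are assumptions: in particular, the receive-inhibit field is interpreted here as a bit mask in which bit g blocks traffic from group g, and `may_deliver` is a hypothetical helper that the disclosure does not describe.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ClientRecord:
    name: str
    connected: bool = False
    group: int = 0
    receive_inhibit: int = 0            # assumed: bit g set = block group g
    bridged: bool = False
    bridge_owner: Optional[str] = None  # assumed: name of the bridging client


class ClientTable:
    """One record per client-socket object, keyed by client name."""

    def __init__(self) -> None:
        self.records: Dict[str, ClientRecord] = {}

    def register(self, name: str, group: int) -> ClientRecord:
        record = ClientRecord(name=name, connected=True, group=group)
        self.records[name] = record
        return record

    def may_deliver(self, sender: str, receiver: str) -> bool:
        """True if the receiver is connected and does not inhibit the sender's group."""
        src = self.records.get(sender)
        dst = self.records.get(receiver)
        if src is None or dst is None or not dst.connected:
            return False
        return not (dst.receive_inhibit >> src.group) & 1
```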

[0059] The server 130 can also provide simple services of database accessing. If 
there is any database provided by an application, the server could deliver the client's 
request to that application and transfer data back to the client. In order for the server 
130 to deliver or distribute the information to the correct client destinations, the data 
package format should include information about the destination, address of the 
client, the length of the data, and the data to be transferred. 
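
One way to lay out such a data package is sketched below. The field widths, the byte order, and the use of numeric identifiers for the destination and for the sending client are assumptions made for illustration; the disclosure specifies only which pieces of information the package carries.

```python
import struct

# Assumed header: 2-byte destination id, 2-byte client (sender) id,
# 4-byte payload length, all in network byte order.
HEADER = struct.Struct("!HHI")


def pack_package(destination: int, client: int, data: bytes) -> bytes:
    """Prefix the payload with destination, client address, and length."""
    return HEADER.pack(destination, client, len(data)) + data


def unpack_package(buffer: bytes):
    """Split a received package back into (destination, client, data)."""
    destination, client, length = HEADER.unpack_from(buffer)
    data = buffer[HEADER.size:HEADER.size + length]
    if len(data) != length:
        raise ValueError("truncated package")
    return destination, client, data
```

Carrying the length explicitly lets the server find the end of each package on a stream connection without inspecting the payload itself.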

[0060] The inventive system may also include the capability to transfer video signals 
and images via the Internet. Note that some applications incorporating remote 
tele-monitoring do not require video-rate image transmission, which makes it 
possible to transmit high-resolution images either directly or with lossless or 
lossy compression schemes. 

[0061] If desired, the inventive omnidirectional imaging system can be modified to 
provide two-way communication between the omnidirectional imaging system and a 
remote observer. This capability may be particularly useful in, for example, security 
applications. Figure 16 is a representative diagram of this embodiment. To provide a 
channel for two-way communication, the omnidirectional imaging system may 
incorporate a speaker 160 and microphone 162 at the same location as the camera 30 
and mirror 34. Audio signals are transmitted from the microphone 162 to a speaker 
163 located at a remote display device 164 on which an image is displayed using the 
perspective window W 40 explained above. The audio transmission can be conducted 
via any known wired or wireless means. In this way, the user can both watch the 
omnidirectional image and hear sounds from the site at which the omnidirectional 
image is being taken. 

[0062] A second microphone 165 provided at the remote display 164 location can 
also be used to transmit audio signals from the remote display 164 location to the 
speaker 160 located with the omnidirectional camera system. In this way, a user can 
speak from the remote monitoring location and be heard at the omnidirectional 
camera system location. Note that the network providing this two-way audio 
transmission can be the Internet if the remote user is monitoring the output of the 
omnidirectional camera system via the Internet. 

[0063] Alternatively, the audio communication between the camera system and the 
remote monitoring location can be one-way communication as dictated by the 
particular application involved. For example, if the user only wishes to hear the 
sound at the camera system location (and not be heard), the camera system may only 
incorporate a microphone and not a speaker. The output of the microphone is then 
transmitted to the remote monitoring location and rendered audible to the user at that 
location as described above. 

[0064] While the invention has been specifically described in connection with certain 
specific embodiments thereof, it is to be understood that this is by way of illustration 
and not of limitation, and the scope of the appended claims should be construed as 
broadly as the prior art will permit. 